Abstract

For the majority of organs in developing vertebrate embryos, left-right
asymmetry is controlled by a ciliated region: the left-right organizer node in the
mouse and human, and the Kupffer's vesicle in the zebrafish. In the zebrafish,
laterality cues from the Kupffer's vesicle determine asymmetry in the
developing heart, the direction of 'heart jogging' and the direction of 'heart
looping'. 'Heart jogging' is the term given to the process by which the
symmetrical zebrafish heart tube is displaced relative to the dorsal midline, with
a leftward 'jog'. Heart jogging is not considered to occur in mammals, although
a leftward shift of the developing mouse caudal heart does occur prior to
looping, which may be analogous to zebrafish heart jogging. Previous studies
have characterized 30 genes involved in zebrafish heart jogging, the majority of
which have well defined orthologs in mouse and human, and many of these
orthologs have been associated with early mammalian heart development.

We undertook manual curation of a specific set of genes associated with heart
development and we describe the use of Gene Ontology term enrichment
analyses to examine the cellular processes associated with heart jogging. We
found that the human, mouse and zebrafish 'heart jogging orthologs' are
involved in similar organ developmental processes across the three species,
such as heart, kidney and nervous system development, as well as more
specific cellular processes such as cilium development and function. The
results of these analyses are consistent with a role for cilia in the determination
of left-right asymmetry of many internal organs, in addition to their known role in
zebrafish heart jogging.

This study highlights the importance of model organisms in the study of human
heart development, and emphasises both the conservation and divergence of
developmental processes across vertebrates, as well as the limitations of this
approach.
Amendments from Version 1

The main changes from the previous version were included to 
address comments from the reviewers: 

The methods subsection - Generation of the list of zebrafish 
jogging genes section: edited to make it clearer how the list of 
zebrafish jogging genes was generated and the limitation of the 
approach taken to create a list of 'heart jogging genes'. 

In Discussion - paragraph 4: edited to create more positive 
statements about the role of the 'jogging ortholog' genes, in human 
heart development. 




Introduction 

An understanding of heart development is important for the treatment
of both congenital and acquired heart disease. The majority
of heart development studies use model organisms for ethical and
practical reasons. Transparent fish embryos, as well as chick embryos,
enable the developing heart to be studied in real time, and the
mouse continues to be a key model organism used to investigate
mammalian heart development. Although there is substantial
evolutionary conservation in the development of left-right axis
asymmetry, there is divergence between species. The earliest events in
mammalian heart development are of great interest, but are poorly
understood relative to externally developing organs, due to practical
constraints.

For the majority of developing vertebrate embryos, left-right
asymmetry is controlled by a ciliated region: the left-right organizer
node in the mouse and human, and the Kupffer's vesicle in the
zebrafish. In the zebrafish, laterality cues from the Kupffer's
vesicle determine asymmetry in the developing heart, and
consequently the direction of heart jogging and heart looping. At
24 hours post-fertilization (hpf) the symmetrical zebrafish heart
tube is displaced relative to the dorsal midline, with a leftward 'jog'.
At 36 hpf the heart tube then loops to the right to create the
asymmetric heart. Cilia within the Kupffer's vesicle are known to be
instrumental in establishing left-right asymmetry and consequently
play a significant role in determining the direction of heart jogging
and heart looping. However, a failure of heart jogging does not
necessarily imply that there will be a failure in heart looping, and vice
versa. In addition, asymmetric cell migration has been implicated
as a key factor in the process of heart jogging. Several of the
genes involved in zebrafish heart jogging have been identified from
mutation, morpholino and functional complementation studies.

We sought to determine whether the use of Gene Ontology (GO)
annotation could offer mechanistic clues to early mammalian
heart development. GO is a controlled vocabulary that is used to
describe gene product function. GO describes three aspects of a
gene product's biology: the biological process that the gene product
is involved in, the specific molecular function of the gene product
and the cellular component that the gene product is located in. GO
terms are associated in a directed acyclic graph (DAG), and thus
have defined relationships to each other.
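
To illustrate what arranging terms in a DAG makes possible, the following
minimal Python sketch stores a handful of parent links and collects every
ancestor of a term. The links shown are a simplified, illustrative fragment
(an assumption made for this example, not an export of the ontology), and the
function name is ours; only the term names are taken from this study.

```python
# Illustrative fragment of a GO-like DAG: child term -> set of parent terms.
# These links are simplified assumptions for the sketch, not the real ontology.
PARENTS = {
    "heart jogging": {"determination of heart left/right asymmetry"},
    "heart looping": {
        "determination of heart left/right asymmetry",
        "heart development",
    },
    "determination of heart left/right asymmetry": {"specification of symmetry"},
    "specification of symmetry": {"pattern specification process"},
    "pattern specification process": {"developmental process"},
    "heart development": {"developmental process"},
    "developmental process": set(),
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:  # a term can be reached by several paths
                seen.add(parent)
                stack.append(parent)
    return seen
```

Because a term may have more than one parent, a gene annotated to a specific
term is implicitly associated with all of its ancestors; this is what allows
broad terms such as 'developmental process' to be tested for enrichment.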

The process of heart looping has been described in a variety of
higher eukaryotes, and the occurrence of dextral-looping, the
early phase of heart looping, appears to be conserved from zebrafish
to chicken to humans. In addition, many congenital heart
abnormalities, such as dextrocardia and isomerisms, are thought to be
due to abnormal heart looping, and ciliary dysfunction has been
associated with 50% of patients with congenital heart disease and
heterotaxy. However, the process of heart jogging has only been
described in zebrafish. Biben and Harvey describe a leftward shift
of the developing mouse caudal heart prior to looping, which may
be analogous to heart jogging in zebrafish, but to our knowledge
this has not been investigated further, and heart jogging is not
considered to occur in mammals. Consequently, when the ontology
describing heart development was expanded, limitations were
included to prevent the association of the GO term 'heart jogging'
to mammalian gene products. However, an absence of evidence
is not evidence of absence, hence it remains a possibility that heart
jogging also occurs in mammalian systems.

Although there has been substantial progress in heart development
research, there are clearly gaps in our understanding of early
heart development, particularly in the mammal. Functional
enrichment analysis of genes known to be involved in zebrafish heart
jogging, and also of the human and mouse orthologs of these zebrafish
heart jogging genes, identifies many conserved biological processes,
functions and cellular locations across these three species. The
results of these analyses support the role of cilia in symmetry
breaking and the importance of cell signalling in early heart development.

Methods 

Generation of the list of zebrafish heart jogging genes 

A list of 30 zebrafish genes that affect heart jogging was compiled
using a variety of approaches. Twelve zebrafish proteins were
identified as they were already annotated to the 'heart jogging' GO terms;
the remaining 18 proteins were then identified using the ZFIN
(http://zfin.org/) Site Search, with the search phrase 'heart jogging',
and filtering using the 'Expression/Phenotypes' category. This
search retrieves figures from papers that have 'heart jogging' in the
figure legend, and thus are likely to be describing specific zebrafish
genes (and proteins) involved in this process. Many of these genes
had not yet been curated with GO terms. Each of the papers
identified in this way was reviewed; of the 23 zebrafish genes identified
in these papers, five (Bmpr1aa, Tbx1, unm_hu119, unm_hu202,
unm_hu304) were eliminated, as none of these papers provided
experimental evidence for the involvement of these genes in heart
jogging. This left 30 zebrafish proteins with strong evidence for a
role in the heart jogging process (Table 1). The experimental
evidence describing the association of each gene to the process of heart
jogging was manually reviewed, to ensure consistent criteria were
applied.

Generation of the list of human and mouse 'jogging ortholog' 
genes 

The HUGO Gene Nomenclature Committee Comparison of Orthology
Predictions (HCOP) search tool
(http://www.genenames.org/cgi-bin/hcop.pl) was used to identify the
closest possible human and mouse ortholog for each of the 30 zebrafish
genes. HCOP displays predictions from 11 homology prediction tools,
including EnsemblCompara, Homologene and Inparanoid. For all but one
gene, southpaw, HCOP returned human or mouse homologs for the
zebrafish genes. The lack of a close mammalian ortholog of southpaw
Table 1. Proteins included in zebrafish 'jogging' gene list and the human
and mouse 'jogging ortholog' gene lists. The evidence for these 30 zebrafish
proteins having a role in heart jogging comes from mutant, morpholino or functional
complementation studies, as described in the associated publications.

Zebrafish gene symbol (protein ID) | Human gene symbol (protein ID) | Mouse gene symbol (protein ID)
-----------------------------------+--------------------------------+-------------------------------
acvrl1 (Q9DGI6)                    | ACVRL1 (P37023)                | Acvrl1 (Q61288)
apc (F1QN37)                       | APC (P25054)                   | Apc (Q61315)
bmp4 (O57574)                      | BMP4 (P12644)                  | Bmp4 (P21275)
bmp7a (Q9PTF9)                     | BMP7 (P18075)                  | Bmp7 (P23359)
bmpr2a (Q288P3)                    | BMPR2                          | Bmpr2
bmpr2b (Q288P2)                    |                                |
camk2a (Q32PV2)                    | CAMK2A (Q9UQM7)                | Camk2a (P11798)
camk2b2 (E7^0^2)                   | CAMK2B (Q13554)                | Camk2b (P28652)
camk2g1 (QA\J9PQ)                  | CAMK2G (Q13555)                | Camk2g (Q923T9)
ccdc103 (Q6DGB6)                   | CCDC103 (Q8IW40)               | Ccdc103 (Q9D9P2)
ccdc40 (Q56A40)                    | CCDC40 (Q4G0X9)                | Ccdc40 (Q8BI79)
cobl (I1X3U9)                      | COBL (O75128)                  | Cobl (Q5NBX1)
dand5 (Q76C29)                     | DAND5 (Q8N907)                 | Dand5 (Q76LW6)
dnaaf1 (Q7ZV84)                    | DNAAF1 (Q8NEP3)                | Dnaaf1 (Q9D2H9)
dub (Q0P484)                       | RCSD1 (Q6JBY9)                 | Rcsd1 (Q3UZA1)
fgfr2 (Q8JG38)                     | FGFR2 (P21802)                 | Fgfr2 (P21803)
foxh1 (Q9I9E1)                     | FOXH1 (O75593)                 | Foxh1 (O88621)
foxj1a (Q08CI2)                    | FOXJ1 (Q92949)                 | Foxj1 (Q61660)
foxj1b (p^R8Z9)                    |                                |
fzd2 (Q90YL7)                      | FZD2 (Q14332)                  | Fzd2 (Q9JIP6)
gsk3b (Q9IBD2)                     | GSK3B (P49841)                 | Gsk3b (Q9WV60)
has2 (Q9Q(34^)                     | HAS2 (Q92819)                  | Has2 (P70312)
lrrc6 (B3DH20)                     | LRRC6 (Q86X45)                 | Lrrc6 (O88978)
nipbla (F5HSE3)                    | NIPBL (Q6KC79)                 | Nipbl (Q6KCD5)
nipblb (F1QBY1)                    |                                |
nkd1 (Q2TJA6)                      | NKD1 (Q969G9)                  | Nkd1 (Q99MH6)
nphp3 (P0CI65)                     | NPHP3 (Q7Z494)                 | Nphp3 (Q7TNH6)
pkd2 (Q6IVV8)                      | PKD2 (Q13563)                  | Pkd2 (O35245)
ptpn11a (Q7Z\NM)                   | PTPN11 (Q06124)                | Ptpn11 (P35235)
southpaw (Q7ZZT5)                  | no mammalian orthologs         |




was confirmed with a UCSC BLAT analysis against the human and
mouse genomes. BLAST analysis showed that the closest possible
human and mouse homolog for the zebrafish southpaw gene
was Nodal (33% identity). Indeed, both southpaw and nodal are
specifically expressed in the left lateral plate mesoderm and
knockdown of murine Nodal in this region leads to a disruption
of cardiac asymmetry, as does injection of southpaw morpholinos,
suggesting a functional orthology between southpaw and Nodal.
However, a reciprocal HCOP search showed that the zebrafish
genes nodal-related 1 and 2 are the closest orthologs of human
NODAL. Hence we have not included a human or mouse ortholog
for zebrafish southpaw (Table 1). Three pairs of zebrafish
paralogs (bmpr2a/bmpr2b; foxj1a/foxj1b; nipbla/nipblb) have a single
corresponding ortholog in human and mouse. Therefore, there are
26 human and 26 mouse orthologs to the 30 zebrafish genes
identified as relevant to zebrafish heart jogging (Table 1).
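
The count of 26 follows from collapsing paralog pairs and dropping the gene
without an ortholog: 30 - 1 (southpaw) - 3 (one per paralog pair) = 26. The
short Python sketch below shows the same bookkeeping on an excerpt of the
mapping; the dictionary is only a partial, illustrative excerpt and the
function name is ours.

```python
# Excerpt of a zebrafish -> human ortholog mapping (None = no ortholog).
# Only the three paralog pairs, southpaw and one ordinary gene are shown.
ZEBRAFISH_TO_HUMAN = {
    "bmpr2a": "BMPR2",
    "bmpr2b": "BMPR2",
    "foxj1a": "FOXJ1",
    "foxj1b": "FOXJ1",
    "nipbla": "NIPBL",
    "nipblb": "NIPBL",
    "pkd2": "PKD2",
    "southpaw": None,
}


def count_orthologs(mapping):
    """Count distinct orthologs, ignoring genes that map to no ortholog."""
    return len({ortholog for ortholog in mapping.values() if ortholog is not None})


# Bookkeeping for the full list: 30 genes, 1 without ortholog, 3 paralog pairs.
expected_full_count = 30 - 1 - 3
```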

Gene ontology annotation 

The human 'jogging ortholog' genes were fully manually annotated
by an experienced GO curator. Individual PubMed queries were
run for each gene using the approved human gene symbol and
filtering on 'human'. To achieve full annotation, all of the relevant
publications (a total of 232) containing unique functional data for each
gene were annotated, regardless of the specific biology described
in each paper. This approach enabled consistent annotation of all
experimental data relating to each gene, thus ensuring an unbiased
overview of any common processes associated with these genes. In
addition, the GO term 'heart looping' was associated with a 'jogging
ortholog' human gene if dextrocardia or situs inversus totalis
phenotypes had been associated with a mutation in the gene, in order
to follow the generally agreed view that leftward heart looping will
have resulted in these phenotypes.

Functional enrichment analysis 

The Mouse Genome Informatics functional enrichment tool VLAD
(VisuaL Annotation Display;
http://proto.informatics.jax.org/prototypes/vlad-1.0.3/) was used to
look for overrepresentation of GO terms in each gene list relative to
the whole genome of the organism. The annotation datasets used for
the analysis were zfin (4th March 2013), goa_human (5th March 2013)
and mgi (7th March 2013) for the zebrafish, human and mouse analyses
respectively, and the ontology dataset used was dated 10th March 2013.
The query gene lists (as UniProt IDs) were pasted into the 'Query Set'
field, the 'Universe Set' field was left blank (to specify all genes in
the species-specific annotation file) and the 'Display Settings' options
selected were 'pruning threshold': 3 and 'collapsing threshold': 6. No
evidence codes were excluded from the analyses. For this analysis the
total numbers of genes (universe set size) having annotations in the
biological process ontology were 14,577, 30,441 and 24,813 for zebrafish,
human and mouse respectively. In line with common practice when
using functional analysis tools, enriched GO terms with 1 or 2
associated query genes were excluded from the final results table.
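
As a rough illustration of what an overrepresentation test computes, the
Python sketch below evaluates a one-sided hypergeometric tail probability.
This is an assumption made for illustration: the hypergeometric test is a
common choice for GO term enrichment, but the exact statistic and any
multiple-testing correction applied by VLAD are not restated here. The
function name is ours; n is the query list size, N the universe set size,
k the number of query genes annotated to a term and M the number of universe
genes annotated to it.

```python
from math import comb


def hypergeom_enrichment_p(k, n, M, N):
    """P(X >= k): probability of seeing at least k annotated genes in a query
    of n genes drawn from a universe of N genes, M of which are annotated."""
    upper = min(n, M)
    total = comb(N, n)
    tail = sum(comb(M, i) * comb(N - M, n - i) for i in range(k, upper + 1))
    return tail / total


# Example inputs taken from this study: 'heart looping' in zebrafish
# (k = 22, M = 83), a 30-gene query list and a universe of 14,577 genes.
p_heart_looping = hypergeom_enrichment_p(k=22, n=30, M=83, N=14577)
```

A small p-value indicates that the term is annotated to more query genes than
would be expected if the query list were a random sample of the universe.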

Creation of an 'early heart development' mouse gene list 

A list of 103 mouse genes likely to play a role in early heart
development was created by combining gene lists derived from
three sources: the Mouse Genome Informatics Mammalian Phenotype
Ontology browser (http://www.informatics.jax.org/searches/MP_form.shtml),
the QuickGO browser (http://www.ebi.ac.uk/QuickGO/) and the 'jogging
ortholog' gene list described above (see Mousegenelist.csv in Data
File). The Mammalian Phenotype Ontology browser was queried for
genotypes annotated with the terms 'abnormal direction of heart
looping', 'situs inversus totalis', 'dextrocardia' and 'mesocardia',
creating a list of 180 genotypes with an associated gene. Due to the
multiple phenotypes associated with each of these genotypes only 58
genes were identified through this approach, and of these only 5
overlap with the 26 'jogging ortholog' genes. Thirty-five genes were
identified by filtering on the GO term 'determination of heart
left/right asymmetry' and its child terms, the evidence code IMP
(Inferred by Mutant Phenotype), and the mouse taxon. Of these only two
are also present in the 'jogging ortholog' gene lists and 11 are
present in the phenotype gene list. Twenty-six mouse 'jogging ortholog'
genes were added to this combined gene list, and any duplicated genes
were removed.
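
The merge described above is a set union. The Python sketch below uses
made-up placeholder identifiers (not real gene symbols), sized to match the
reported counts, to show how duplicates disappear. One inference here is ours
rather than the authors': with lists of 58, 35 and 26 genes and pairwise
overlaps of 5, 2 and 11, a union of 103 implies, by inclusion-exclusion, that
2 genes are common to all three lists.

```python
def implied_triple_overlap(sizes, pairwise_overlaps, union_size):
    """Solve |A u B u C| = sum(sizes) - sum(pairwise) + |A n B n C| for the
    three-way overlap."""
    return union_size - sum(sizes) + sum(pairwise_overlaps)


def labelled(prefix, count):
    """Build a set of placeholder identifiers such as 'p_0', 'p_1', ..."""
    return {f"{prefix}{i}" for i in range(count)}


# Placeholder genes, partitioned so that the list sizes and pairwise overlaps
# match the reported numbers (hypothetical identifiers throughout).
all_three = labelled("all3_", 2)
pheno_and_jog = labelled("pj_", 3)   # 5 shared in total, 2 of them in all_three
go_and_pheno = labelled("gp_", 9)    # 11 shared in total, 2 of them in all_three

phenotype_list = all_three | pheno_and_jog | go_and_pheno | labelled("p_", 44)
go_list = all_three | go_and_pheno | labelled("g_", 24)
jogging_list = all_three | pheno_and_jog | labelled("j_", 21)

# Union of the three lists; set semantics remove duplicated genes.
combined = phenotype_list | go_list | jogging_list
```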

Results 

Annotation of the zebrafish heart jogging genes and the 
human 'jogging ortholog' genes 

Thirty zebrafish genes were annotated to the GO term 'heart jog- 
ging' or one of its child terms based on experimental data from 
the literature (Table 1). Human and mouse orthologs of these genes 
were identified, as described in the Methods section, resulting in a 
list of 26 mammalian 'jogging orthologs'. 

The human 'jogging ortholog' genes were then fully annotated
with GO terms based on published experimental data. All manual
annotations to the human, mouse and zebrafish genes can be
visualized with the QuickGO Gene Ontology browser:
http://tinyurl.com/humanortholog, http://tinyurl.com/mouseortholog and
http://tinyurl.com/zebrafishgenes.

Functional enrichment analysis 

The zebrafish heart jogging gene list and the human and mouse
'jogging ortholog' gene lists were analysed using the VLAD
enrichment tool. This identified 155 biological process GO terms
that were significantly enriched in the zebrafish (see Human_data.csv
in Data File), 431 in the human (see Human_data.csv in
Data File) and 402 in the mouse (see Mouse_data.csv in Data File)
gene lists. The enriched GO terms from all three species were
grouped into five biological areas: Development, Patterning,
Cellular Process, Signalling and Movement. The relative enrichment of key
GO terms from each area was compared across all three species (see
Biological_process_summary.csv in Data File; summarized in Table 2).

Enrichment of heart development terms. As expected there was a
significant enrichment of developmental process terms in all three
gene lists, including an enrichment of the GO term 'heart
development'. However, there was also enrichment of terms such as 'renal
system development' and 'nervous system development', indicating
the role of these proteins in regulating the development of a range of
organ systems and tissues. These data analyses also show an
enrichment of terms describing specific, but universal, cellular processes,
such as signalling and regulation of transcription (Table 2). These
terms represent essential aspects of development, but are grouped
discretely due to their roles in many other biological processes.

'Pattern specification', described in GO as a 'developmental process
that results in the creation of defined areas or spaces within an
organism to which cells respond and eventually are instructed to
differentiate', and several of its more specific child terms (such as
'specification of symmetry'), were also enriched in all three gene lists.
Within the symmetry ontology, the GO term 'determination of heart
left/right asymmetry' is annotated to all 30 genes in the zebrafish
jogging gene list; however, it is only associated with 8 and 4
jogging ortholog genes in the human and mouse respectively. Of the
97 zebrafish genes associated with 'determination of heart left/right
asymmetry', 31% are also present in the zebrafish jogging gene list.
In contrast, only 11% of the human and 8% of the mouse genes
associated with this term are also 'jogging orthologs'. These results
confirm an overlap in the functional role of the zebrafish jogging genes
and the human and mouse orthologs in the determination of heart
left/right asymmetry. However, this relatively low level of overlap
may reflect the limitations of model organism and human research in
this area, rather than a lack of functional conservation of these genes.
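
The percentages quoted above are k/M, where k is the number of genes in a
query list annotated to the term and M is the number of genes in that species
annotated to it. The following Python check recomputes them for
'determination of heart left/right asymmetry' from the k and M values in
Table 2; the dictionary layout and function name are illustrative.

```python
# (k, M) per species for 'determination of heart left/right asymmetry'.
COUNTS = {"zebrafish": (30, 97), "human": (8, 71), "mouse": (4, 48)}


def percent_of_term(k, M):
    """Share of all genes annotated to a term that are in the query list."""
    return round(100 * k / M, 1)


percentages = {
    species: percent_of_term(k, M) for species, (k, M) in COUNTS.items()
}
```

Rounded to whole numbers these give the 31%, 11% and 8% cited in the text.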

In addition, there were some differences in the developmental
terms that were enriched between species. For example, the GO
terms 'vasculature development' and 'sensory organ development'
are enriched in both the human and mouse 'jogging ortholog' gene
lists (Table 2), but neither of these processes is enriched in the
zebrafish jogging genes. This difference may reflect the
type of experiments zebrafish are used for, rather than reflecting a
difference between zebrafish and mammals in the genes required
for these developmental processes.
Table 2. Comparison of enriched Gene Ontology terms across orthologous gene lists from zebrafish, human and
mouse. The enriched GO terms were grouped into specific ontology areas, with a selection of more specific child terms
(preceded with a dash) also included. The full list of grouped GO terms can be found in Table S4, which also shows the genes
annotated to each term from each of the three species. k: the number of genes in each gene list annotated to the GO term; M:
the number of genes in the species proteome annotated to the GO term. Empty cells indicate that no values are given for that species.

Gene Ontology term | Zebrafish k | M | k/M as % | Human k | M | k/M as % | Mouse k | M | k/M as %

DEVELOPMENT
GO:0032502 developmental process | 30 | 2357 | 1.3% | 24 | 6803 | 0.4% | 22 | 3945 | 0.6%
- GO:0009888 tissue development | 30 | 673 | 4.5% | 17 | 1849 | 0.9% | 14 | 1138 | 1.2%
- GO:0072358 cardiovascular system development | 30 | 487 | 6.2% | 15 | 1095 | 1.4% | 10 | 679 | 1.5%
- GO:0001944 vasculature development |  |  |  | 7 | 666 | 1.1% | 7 | 436 | 1.6%
- GO:0007507 heart development | 30 | 268 | 11.2% | 13 | 634 | 2.1% | 8 | 390 | 2.1%
- GO:0001947 heart looping | 22 | 83 | 26.5% | 7 | 68 | 10.3% |  |  |
- GO:0007399 nervous system development | 8 | 700 | 1.1% | 15 | 2802 | 0.5% | 12 | 1486 | 0.8%
- GO:0072001 renal system development | 6 | 100 | 6.0% | 9 | 386 | 2.3% | 6 | 200 | 3.0%
- GO:0007423 sensory organ development |  |  |  | 10 | 738 | 1.4% | 8 | 480 | 1.7%
- GO:0048736 appendage development | 4 | 123 | 3.3% | 4 | 237 | 1.7% | 4 | 159 | 2.5%
- GO:0050793 regulation of developmental process | 6 | 267 | 2.2% | 14 | 2383 | 0.6% | 14 | 1542 | 0.9%

PATTERNING
GO:0007389 pattern specification process | 30 | 435 | 6.9% | 15 | 644 | 2.3% | 12 | 414 | 2.9%
- GO:0009799 specification of symmetry | 30 | 177 | 16.9% | 10 | 147 | 6.8% | 7 | 90 | 7.8%
- GO:0061371 determination of heart left/right asymmetry | 30 | 97 | 30.9% | 8 | 71 | 11.3% | 4 | 48 | 8.3%

CELLULAR PROCESS
GO:0071840 cellular component organization or biogenesis | 12 | 1124 | 1.1% | 17 | 5991 | 0.3% | 15 | 3306 | 0.5%
- GO:0030030 cell projection organization | 10 | 274 | 3.6% | 12 | 1311 | 0.9% | 11 | 644 | 1.7%
- GO:0051128 regulation of cellular component organization |  |  |  | 10 | 1894 | 0.5% | 10 | 1293 | 0.8%
- GO:0031344 regulation of cell projection organization |  |  |  | 5 | 452 | 1.1% | 5 | 311 | 1.6%
GO:0006468 protein phosphorylation | 8 | 696 | 1.1% | 8 | 842 | 1.0% | 9 | 748 | 1.2%
- GO:0001932 regulation of protein phosphorylation |  |  |  | 8 | 1124 | 0.7% | 8 | 722 | 1.1%
GO:0006357 regulation of transcription from RNA polymerase II promoter | 8 | 166 | 4.8% | 10 | 1939 | 0.5% | 12 | 1268 | 0.9%
GO:0042127 regulation of cell proliferation | 5 | 70 | 7.1% | 9 | 1769 | 0.5% | 9 | 1123 | 0.8%
GO:0007049 cell cycle |  |  |  | 9 | 1644 | 0.5% | 8 | 915 | 0.9%
- GO:0051726 regulation of cell cycle |  |  |  | 6 | 934 | 0.6% | 6 | 538 | 1.1%

SIGNALLING
GO:0023052 signaling | 12 | 2131 | 0.6% | 16 | 6682 | 0.2% | 16 | 4568 | 0.4%
- GO:0023051 regulation of signaling | 7 | 521 | 1.3% | 16 | 3227 | 0.5% | 16 | 1931 | 0.8%
GO:0050896 response to stimulus |  |  |  | 20 | 10767 | 0.2% | 21 | 6679 | 0.3%
- GO:0048583 regulation of response to stimulus | 7 | 557 | 1.3% | 15 | 3799 | 0.4% | 15 | 2140 | 0.7%

MOVEMENT
GO:0007017 microtubule-based process | 6 | 188 | 3.2% | 6 | 873 | 0.7% | 4 | 358 | 1.1%
GO:0040011 locomotion | 10 | 289 | 3.5% | 10 | 1448 | 0.7% | 8 | 753 | 1.1%
- GO:0040012 regulation of locomotion |  |  |  | 6 | 657 | 0.9% | 5 | 473 | 1.1%
- GO:2000145 regulation of cell motility |  |  |  | 6 | 608 | 1.0% | 5 | 439 | 1.1%
- GO:0016477 cell migration | 8 | 200 | 4.0% | 6 | 931 | 3.6% | 5 | 504 | 1.0%

Enrichment of cilia terms. Terms in the cellular component
organization or biogenesis ontology were enriched across all three gene
lists (Table 2 and Biological_process_summary.csv in Data File).
Specifically there was an enrichment of terms describing 'cilium
morphogenesis' and 'protein complex assembly' (Figure 1). Within
each of these, some more specific terms were enriched; for example,
the human and mouse 'jogging ortholog' gene lists were enriched
for the term 'axonemal dynein complex assembly', whilst the
zebrafish and human gene lists showed an enrichment of the term
'cilium assembly'.

Terms such as 'regulation of cell projection organization' were also
enriched in the human and mouse 'jogging ortholog' gene lists.
'Regulation' terms have a 'regulates' relationship with the relevant
processes; for example the term 'positive regulation of cell projection
organization' has a 'positively_regulates' relationship to the term
'cell projection organization'. In GO an important benefit of building
a DAG, rather than a flat list of controlled vocabulary terms, is
that relationships can be used to make inferences from one term to
another. However, the VLAD enrichment tool does not automatically
create a transitive relationship between 'regulation' terms and the
processes or functions they regulate. Consequently genes annotated to a
'regulation' term will not be associated with the regulated process
term (unless there is an independent annotation to the process term).
It is also important to recognise that it can be difficult for a curator to
choose between annotating to the biological process itself, or to the
term describing the regulation of that biological process, based on
the published experimental data. Therefore, to get a full picture of
the genes involved in a process, including the genes that regulate the
process, it is necessary to combine the genes annotated to GO terms
describing both the 'process' and the 'regulation of the process'.
For example, 10 zebrafish, 12 human and 14 mouse genes within
the 'jogging ortholog' gene lists are annotated to either 'cell
projection organization' or 'regulation of cell projection organization'
Figure 1. View of relationships between enriched terms from the cellular component organization or biogenesis ontology.
The GONUTs view of relationships between enriched terms from the cellular component organization or biogenesis ontology. The grey arrows
are used where a term has an 'is a' relationship to its parent term; the blue arrows indicate a 'part of' relationship. The bars below each GO
term indicate which of these terms are enriched in the zebrafish 'jogging' gene list, and the human and mouse 'jogging ortholog' gene lists.






F1000Research 2014, 2:242 Last updated: 19 FEB 2014



(or children of these terms). This represents 33%, 40% and 47% of
these zebrafish, human and mouse 'jogging' gene lists respectively,
indicating that the process of cell projection organization is an
important function for this group of genes. In addition, many of the
'jogging' and 'jogging ortholog' genes annotated to 'cell projection
organization' terms have also been annotated to the cellular
component term 'cell projection' (7, 11 and 13 genes in zebrafish, human and
mouse, respectively; see Data File). The enrichment of the biological
process term 'cell projection organization' and cellular component
term 'cell projection' within these gene lists is consistent with the key
role of cilia located in the node/Kupffer's vesicle in determining heart
left/right asymmetry in all three species.
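
Because 'regulates' links are not followed automatically by the enrichment
tool, the combination of 'process' and 'regulation of the process'
annotations has to be made explicitly. A minimal Python sketch of that step
follows; the gene identifiers and the annotation mapping are hypothetical
placeholders (not taken from the data files), and the function names are ours.

```python
# Hypothetical annotations: GO term -> set of placeholder gene identifiers.
ANNOTATIONS = {
    "cell projection organization": {"geneA", "geneB", "geneC"},
    "regulation of cell projection organization": {"geneC", "geneD"},
}


def genes_for_process(process, annotations=ANNOTATIONS):
    """Genes annotated to a process or to the regulation of that process."""
    direct = annotations.get(process, set())
    regulators = annotations.get(f"regulation of {process}", set())
    return direct | regulators  # union, so co-annotated genes count once


def fraction_of_list(genes, gene_list):
    """Fraction of a gene list that falls within the given gene set."""
    return len(genes & gene_list) / len(gene_list)
```

Taking the union rather than summing the two counts avoids double counting
genes that carry both a 'process' and a 'regulation' annotation.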

Enrichment of cell migration terms. Cell migration also plays a
key role in the establishment of the heart cone, heart jogging and
heart looping, and enrichment of the GO term 'cell migration'
is seen in the 'jogging' gene lists of all three species (Table 2 and
Biological_process_summary.csv in Data File). Lenhart et al. (2013)
identified FoxH1, spaw, Bmp4, Lefty1 and Has2 as essential to the
asymmetric cell migration that leads to heart jogging. However, our
literature review suggests that some genes may have functions in
both cilia assembly, within the Kupffer's vesicle, and cell migration.
For example, thymocytes from Foxj1 transgenic mice display
defective migration, whereas Foxj1-null mice are defective in
ciliogenesis. Similarly, in zebrafish, Fzd2 has been shown to play a role in
cilium assembly as well as pancreatic insulin-cell migration.
Consequently, further investigations into the role of these genes in heart
jogging cell migration may provide further insight into this process.

Co-annotation of heart development associated genes 

In order to investigate the contribution of individual genes to the
multiple processes associated with early heart development we
created human and mouse heart development gene lists and examined
the associated GO biological process terms. A list of 103 mouse
genes with roles in early heart developmental processes was created
by merging the three gene lists created using the Mouse Genome
Informatics phenotype browser, the QuickGO browser and
the 'jogging ortholog' gene list (Mousegenelist.csv in Data File).

GO captures a range of biological processes that a single gene 
is involved in. By comparing the overlap between the GO terms 
associated with specific gene lists it is possible to see what cel- 
lular mechanisms are likely to be contributing to the various heart 
developmental processes. Using the QuickGO browser, genes in 
the zebrafish 'heart jogging' gene list, which were associated with 
the GO terms 'heart looping', 'signal transduction', 'cell migration' 
and 'cell projection organization' (and all child terms, including 
'regulation' terms), were downloaded, as well as the genes associ- 
ated with these terms that were also present in the mouse 'early 
heart development' gene list (Mousegenelist.csv in Data File). 

In the zebrafish 'heart jogging' gene list a similar proportion of
the genes have the potential to play a role in cell projection
organisation (10 genes), cell migration (8 genes) and signal
transduction (13 genes) (Figure 2A). In the list of 103 mouse genes that
are associated with early heart development, either by phenotype,
annotation or homology to the zebrafish 'heart jogging' gene list,
82 have been annotated to the GO term 'heart looping'. In contrast to
the zebrafish 'jogging' gene list, signal transduction appears to play
a major role in mouse early heart development, with 27 genes
associated with both signal transduction and heart looping, whereas
only 18 and 9 genes, respectively, are associated with cell migration
and cell projection organization (Figure 2B). These results fit well
with what is known about these gene lists. The zebrafish 'jogging'
gene list defines a group of genes whose functions are required
very early in heart development, when the role of cilia in symmetry
breaking initiates the heart jogging process. In the mouse
'early heart development' gene list, by contrast, the genes included have
roles in heart looping, which is a developmentally later event than heart
jogging. Therefore, although the initial events associated with
breaking of left-right symmetry are represented within this gene list, the
genes involved in the later process of ensuring the complex looping
of the heart tube, through controlled signalling and cell migration,
contribute a large proportion of this list.

Human disease phenotypes associated with the 'jogging 
ortholog' genes 

While annotating the 26 human 'jogging ortholog' genes we noticed 
that almost half of these genes have not been associated with a spe- 
cific disease phenotype (Table 3). However, of the 26 genes exam- 
ined, mutations in 14 had been associated with a disease phenotype, 
a fifth of which were ciliopathies. Dextrocardia or situs inversus 
totalis (reversal or mirroring of the major visceral organs) was 
associated with 6 of the human 'jogging ortholog' genes. Loca- 
tion of the heart on the right side (rather than the left) is generally 
agreed to be the result of left-handed, instead of right-handed,
looping of the heart tube in early embryogenesis. The association of
these 'jogging ortholog' genes with heart looping defects confirms
that there is conserved functional homology between at least some
of these orthologous zebrafish and human genes in the very early
stages of heart development, which lead to the initial heart
asymmetry. All four of the ciliopathy-associated 'jogging orthologs' were
also described as associated with situs inversus totalis, confirming 
the conserved role of these genes in the cilia within the symmetry 
determining left-right organizer. 



Significantly enriched GO terms in heart jogging genes in
zebrafish and their human and mouse orthologs

F1000Research

5 Data Files 

http://dx.doi.org/10.6084/m9.figshare.844630 



Discussion 

We have used GO to annotate the key genes involved in zebrafish 
heart jogging and their human and mouse orthologs. Heart jogging 
is not a process that is thought to occur in mammals. However, these 
genes are conserved between species and play essential roles in 
many developmental processes. The information available about 
these genes in several diverse species can be used to shed light 
on the roles of these genes and possible mechanisms in heart jogging
and other heart developmental processes. Our analyses are in
agreement with the well described essential role of cilia in early
development, with a third of the zebrafish 'heart jogging' genes
associated with the biological process 'cell projection organization'
(Table 2).

Figure 2. Venn diagrams describing the multiple roles of genes associated with heart development. Venn diagrams showing the overlap
between the GO terms associated with A) the zebrafish 'heart jogging' gene list (30 genes) and B) the mouse combined heart development
gene list (103 genes).

However, it is also important to recognise that although there is
considerable evidence for conserved mechanisms of heart development
across vertebrates, there are also areas of divergence. For
example, in the mouse, zebrafish and Xenopus the rotation of cilia
is responsible for the early asymmetric gene expression pattern
around the left-right organizer, whereas cilia do not play a role in
symmetry breaking in the chicken or pig.

The early phases of heart development are particularly difficult
to study in mammals; however, various approaches are enabling
progress in this area, and using phenotype, annotation and
orthology data we have created a list of 103 genes with a putative
role in early mouse heart developmental processes. Furthermore,
the phenotypes associated with experimentally generated
mutant mice provide further clues to the likely role of these genes in
human heart development; the genes associated with situs inversus
totalis phenotypes are most likely to have functional roles within the
node. Conversely, genes not associated with situs inversus totalis
but associated with an abnormal direction of heart looping,
dextrocardia or mesocardia are likely to be involved in the response of
the embryonic heart tube to the left/right asymmetry signals. This
is not a completely reliable interpretation: for example, mutations
in the transcription factor Pitx2 lead to mice with situs inversus
totalis, yet Pitx2 is expressed in the left lateral plate and
its continued asymmetric expression is necessary for asymmetric
morphogenesis of most visceral organs. The mouse knockout
consortia data will continue to help with the identification of
additional early heart development genes, and informed interpretation
of these phenotypes will make it possible to separate those
genes likely to be associated with the node from those with functions
within the heart tube.

In humans, defects in early heart development are likely to result
in spontaneous abortion and therefore many genes required for
early heart development will go undetected. Consequently, human
embryos with heart defects that develop to full term represent
the less severe end of the spectrum. Mutations in several human
genes have now been identified as causative of abnormal heart looping,
such as ACVR2B, LEFTY2, GJA1 and ZIC3 (49-52), and some of the
'jogging ortholog' genes (CCDC103, CCDC40, DNAAF1, LRRC6,
NPHP3 and PKD2) are also associated with heart looping defects,
thus providing evidence to support an involvement of these genes
in left-right asymmetry determination in the heart. Furthermore,
mutations in some of the 'jogging ortholog' human genes, FOXH1
and PTPN11, are associated with heart septal defects in humans,
which seems to imply that in individuals with these mutations early
heart developmental processes have proceeded normally, suggesting
that, contrary to their role in zebrafish, these genes may not be
involved in the early stages of human heart development. However,
there are other possible reasons why there is a poor association of
heart defects with the 'jogging ortholog' gene list. This may simply
be due to the lack of detection of situs inversus totalis, or reflect
a redundancy in gene function, or it may be that the majority of
mutations in these genes are simply not detected in humans because
they are masked by first trimester spontaneous abortions, which are
known to have a high level of heart defects.

The impact of lethal mutations on detection of genes associated
with heart development suggests that mutations in these
genes will only be detected in individuals carrying mutations with a
relatively minor impact on gene function. This idea is supported by
the recent identification of multiple 'minor' heterozygous mutations
within a functional network in three patients with transposition of
the great arteries. All of these genes either participate or cooperate
within the Nodal signaling pathway and the carriers of single
mutations exhibit no heart or laterality defects. The impact of
'minor' mutations, such as these, may explain the contribution of
'genetic modifiers' to congenital heart defects with variable
penetrance within a family or may suggest a polygenic basis for
some of these diseases. This is supported by model organism
data, which provide evidence of multigenic origins for congenital
heart disease. However, model organisms are rarely used to
examine the impact of genetic modifiers on heart development, as the
majority of model organisms are inbred and examination of mutations
leading to 'minor' phenotypic variations is often not viewed
with the same level of interest as the more extreme heart development
defects.

Table 3. Diseases associated with the human 'jogging ortholog' genes. The associated diseases are described in the
listed publications.

| Human gene symbol (protein ID) | Heart relevant phenotype | Other associated phenotypes |
|---|---|---|
| ACVRL1 (P37023) | - | Hereditary haemorrhagic telangiectasia type 2 (HHT2); HHT2 with pulmonary hypertension |
| APC (P25054) | - | Familial adenomatous polyposis coli-1 |
| BMP4 (P12644) | - | Microphthalmia, syndromic 6; orofacial cleft 11 |
| BMP7 (P18075) | - | - |
| BMPR2 (Q13873) | - | Pulmonary hypertension |
| CAMK2A (Q9UQM7) | - | - |
| CAMK2B (Q13554) | - | - |
| CAMK2G (Q13555) | - | - |
| CCDC103 (Q8IW40) | Dextrocardia, situs inversus totalis | Ciliary dyskinesia, primary, 17 |
| CCDC40 (Q4G0X9) | Situs inversus totalis | Ciliary dyskinesia, primary, 15; Kartagener's Syndrome |
| COBL (O75128) | - | - |
| DAND5 (Q8N907) | - | - |
| DNAAF1 (Q8NEP3) | Situs inversus totalis | Ciliary dyskinesia, primary, 13 |
| FGFR2 (P21802) | - | Several craniosynostosis, see OMIM for more information |
| FOXH1 (O75593) | Ventricular septal defect, transposition of the great arteries | - |
| FOXJ1 (Q92949) | - | - |
| FZD2 (Q14332) | - | - |
| GSK3B (P49841) | - | - |
| (Q92819) | - | - |
| LRRC6 (Q86X45) | Situs inversus totalis | Ciliary dyskinesia, primary, 19, Kartagener's Syndrome |
| NIPBL (Q6KC79) | Cardiac septal defects (not confirmed as associated with NIPBL mutations) | Cornelia de Lange syndrome 1 |
| NKD1 (Q969G9) | - | Colorectal adenocarcinoma |
| NPHP3 (Q7Z494) | Situs inversus totalis | nephronophthisis type, Meckel syndrome type, renal-hepatic-pancreatic dysplasia |
| PKD2 (Q13563) | Dextrocardia, situs inversus totalis | Polycystic kidney disease 2 |
| PTPN11 (Q06124) | atrioventricular canal defects | juvenile myelomonocytic leukemia, LEOPARD syndrome, Noonan syndrome |
| RCSD1 (Q6JBY9) | - | - |



Next Generation Sequencing (NGS) has the potential to identify
many more instances of multiple mutations in genes which are 
functionally linked through a specific pathway. However, teasing 
out which gene mutations are contributing to a disease, as a genetic 
modifier or as the causative gene variant, and which are not involved 
in the disease, is likely to take considerable time. Gene Ontology, 
KEGG and Reactome pathways, along with protein interaction 
networks have the potential to inform the process of identifying 
genetic variants associated with heart defect risk through the identi- 
fication of pathways and networks which are common to the genes 
associated with the risk gene variants. Consequently, interpretation 
of NGS data will be greatly improved with full annotation of the 
candidate genes involved. The identification of these risk gene vari- 
ants is likely to be of considerable value to those patients seeking 
prenatal diagnosis. In addition, the identification of more genes
associated with heart defects will also help clarify the conserved
and divergent heart development pathways that exist between humans
and key model organisms.
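
As an illustration of how pathway annotation could help prioritise variants, the sketch below ranks pathways by the number of candidate genes annotated to them. It is a hedged example only: the function name `shared_pathways`, the gene names and the pathway labels are hypothetical placeholders, and in practice the gene-to-pathway mapping would be derived from resources such as GO, KEGG or Reactome.

```python
from collections import defaultdict


def shared_pathways(gene_to_pathways, candidates, min_genes=2):
    """Rank pathways by how many candidate genes are annotated to them.

    Only pathways containing at least `min_genes` candidate genes are kept.
    Returns a list of (pathway, genes) sorted by descending gene count,
    then by pathway name.
    """
    hits = defaultdict(set)
    for gene in candidates:
        # Genes without any annotation simply contribute nothing.
        for pathway in gene_to_pathways.get(gene, ()):
            hits[pathway].add(gene)
    shared = {p: g for p, g in hits.items() if len(g) >= min_genes}
    return sorted(shared.items(), key=lambda kv: (-len(kv[1]), kv[0]))


# Hypothetical annotations and candidate variant-carrying genes.
gene_to_pathways = {
    "gene1": {"pathway X", "pathway Y"},
    "gene2": {"pathway X"},
    "gene3": {"pathway X", "pathway Z"},
    "gene4": {"pathway Z"},
}
candidates = ["gene1", "gene2", "gene3", "gene4"]

ranked = shared_pathways(gene_to_pathways, candidates)
for pathway, genes in ranked:
    print(pathway, sorted(genes))
```

A simple count of this kind does not establish causation; it only highlights pathways in which several variant-carrying genes converge, which may then merit closer examination.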

Conclusions 

This study demonstrates that full annotation, using GO, of a set of 
genes known to be associated with early stages of heart develop- 
ment in zebrafish can be used to confirm functional conservation 
of the role of these genes in a variety of developmental processes. 
While this study supports the assertion of gene function based on 
orthology between genes, it also identifies that for some genes there 
is no direct evidence for their conserved involvement in specific 
developmental processes through evolution. Consequently, for evo- 
lutionary studies, manual annotation of the genome of individual 
species will be necessary to enable a bioinformatics approach to 
investigating the evolution of developmental processes. 
The title and abstract accurately describe the content of the manuscript. 
Method: 

The analysis described here is based on the selected 30 genes from literature that are associated with the 
GO term 'heart jogging'. According to the description by the authors these genes are involved in zebrafish 
heart jogging, but a better definition would be that these genes affect heart jogging (either direct or 
indirect). Indeed some of the genes that were included play a role in cilia function and establishment of 
the left-right axis. 

Many more genes have been identified in zebrafish that affect left-right patterning and thus 'heart jogging'. 
It remains unclear why these were not included in the search. The observation that many zebrafish 
left-right genes are not associated with the GO term 'cardiac jogging' could be due to annotation issues 
(e.g. not every author uses the term 'jogging' for heart laterality defects in zebrafish embryos). This 
limitation in the design of this study should be discussed more thoroughly. 

In addition the authors could make some conclusions about the usefulness of the GO term 'heart jogging'. 
Although heart jogging is a process specific to zebrafish, it is controlled by a conserved left-right 
patterning mechanism. However, the GO term 'cardiac jogging' is associated with genes that control heart 
morphogenesis and left-right patterning. 
1 Comment 



Author Response 

Ruth Lovering, University College London, UK 
Posted: 19 Feb 2014 

Dear Jeroen, 

Thank you for your comments about our paper. 

We have updated the methods, as suggested, to: A list of 30 zebrafish genes that affect heart 
jogging... 

As already described in the methods, we only included genes in this list where there was 
experimental evidence in the ZFIN database confirming that these genes had an impact on heart 
jogging in zebrafish. Many genes are likely to be missing from this list because the heart jogging 
process is often not studied in zebrafish carrying mutations in relevant genes. 

We have added the following statement to the methods: 

This list does not represent all genes which play a role in heart jogging, as the process of heart 
jogging is not always studied in zebrafish carrying mutations in relevant genes. 

We have not provided a detailed discussion about the GO term 'heart jogging' as we do not want to 
describe the detail of GO transitivity any further here. 

Regards 

Ruth 

Competing Interests: No competing interests were disclosed. 
The title is fine, but it would be better if it summarized findings rather than summarized the 
procedures. The abstract is an adequate summary of the article. 



Required revisions: 

• Methods- Generation of the list of zebrafish jogging genes: "The search identified a further 23 
zebrafish genes, however manual review of these publications led to 5 being disregarded, as the 
evidence for an involvement in heart jogging was not strong enough." The phrase, "was not strong 
enough" does not provide sufficient detail to make construction of this list reproducible. The details 
should be provided. 

• Discussion- paragraph 4: "only a few of the 'jogging ortholog' genes, CCDC103, CCDC40, 
DNAAF1, LRRC6, NPHP3, and PKD2, are associated with heart looping defects, and thus provide 
evidence which suggests an involvement of these genes in left-right asymmetry determination in the 
heart." First, making an assertion that genes are not associated with a particular process requires 
that high-powered studies were performed to reach a negative conclusion. Does the literature 
support this? Second, while it is true that eliminating one possibility increases the probability of 
other explanations, it is not very solid evidence for a particular explanation. The logic of the above 
statement should be better supported, or the statement should be removed. 

• Discussion- paragraph 4: "However, there are numerous other reasons why there is a poor 
association of heart defects with the 'jogging ortholog' gene list." This should be, "...numerous 
other possible reasons ...". 

I have read this submission. I believe that I have an appropriate level of expertise to confirm that 
it is of an acceptable scientific standard. 

Competing Interests: No competing interests were disclosed. 



1 Comment 

Author Response 

Ruth Lovering, University College London, UK 
Posted: 19 Feb 2014 

Dear Vincent, 

Thank you for your helpful comments. We also recognise that, since the original search was 
conducted, there are now an additional 7 zebrafish genes (Celf1, Cdc14aa, Cdc14b, 
Dmrt2a, Enpp2, Grem2, Lpar3) with experimental evidence supporting their role in heart jogging 
(Matsui, T. et al., 2012; Clement, A. et al., 2012; Lai, S.L. et al., 2012; Muller, I.I. et al., 2013). 

As this comment will remain associated with this manuscript we will not add a comment about 
these additional genes to the manuscript itself, but we will revise the manuscript as follows: 

Methods - Generation of the list of zebrafish jogging genes section: 
To make it clearer how this list of zebrafish jogging genes was generated we will replace part of 
this section with the following: 



"A list of 30 zebrafish heart jogging genes was compiled using a variety of approaches. Twelve 
zebrafish proteins were identified as they were already annotated to the 'heart jogging' GO terms, 
the remaining 18 proteins were then identified using the ZFIN (http://zfin.org/) Site Search, with 
the search phrase 'heart jogging', and filtering using the 'Expression/Phenotypes' category. This 
search retrieves figures from papers that have 'heart jogging' in the figure legend, and thus are 
likely to be describing specific zebrafish genes (and proteins) involved in this process. Many of 
these genes had not yet been curated with GO terms. Each of the papers identified in this way 
was reviewed; of the 23 zebrafish genes identified in these papers five (Bmpr1aa, Tbx1, 
unm_hu119, unm_hu202, unm_hu304) were eliminated, as none of these papers provided 
experimental evidence for the involvement of these genes in heart jogging." 

In Discussion - paragraph 4: 

We did not intend to draw a negative conclusion here; this was a positive 
statement that some of the 'jogging ortholog' genes, CCDC103, CCDC40, DNAAF1, LRRC6, 
NPHP3, and PKD2, are associated with heart looping defects. And as included in the comment 
below we do state that 'there are numerous other reasons why there is a poor association of heart 
defects with the 'jogging ortholog' gene list'. To make this statement less controversial, we will 
modify it as follows: 

"Mutations in several human genes have now been identified as causative of abnormal heart 
looping, such as ACVR2B, LEFTY2, GJA1 and ZIC3(49-52), and some of the 'jogging ortholog' 
genes (CCDC103, CCDC40, DNAAF1, LRRC6, NPHP3 and PKD2) are also associated with heart 
looping defects. Thus providing evidence to support an involvement of these genes in left-right 
asymmetry determination in the heart." 

Discussion- paragraph 4 (2nd comment): 

As suggested we will revise this to "However, there are numerous other possible reasons why 
there is a poor association of heart defects with the 'jogging ortholog' gene list." 

Competing Interests: No competing interests were disclosed. 